\section{Related Work}
\label{sec:related}

Recently, much work focuses on mining evolving
data~\cite{hulten01time,street01streaming,chen02regres,guha00clustering,ensembleoverfitting,streamensemble,loadstar,loadstardemo,wangicdm05}.
Much effort has been made to adapt traditional classifiers for
evolving
data~\cite{hulten01time,street01streaming,chen02regres,guha00clustering,ensembleoverfitting,wangicdm05}.
Most work, however, falls into the category of \emph{chasing the
  snapshots}, that is, it learns and re-learns models in the evolving
data, and then it uses a single snapshot or several snapshots together
to make predictions, instead of trying to reveal the big picture, that
is, the underlying data generating mechanism.


These approaches have negative impacts on the accuracy of a stream
classifier. Besides, model training is often a time consuming, offline
process.  To keep up with the high data throughput in testing, we
create impromptu models of low quality. In particular, it is hard to
find out what data an up-to-date model should rely on. A large set of
data may include changing concepts, and a small set will cause model
over-fitting.

The possibility that history always repeats itself, that is, only a
limited number of concepts exists in an endless data stream, has
prompted researchers to devise more efficient learning schemes. The
RePro approach~\cite{proactive} seeks to group two base models if they
tend to agree with each other in classifying a training dataset. Thus,
large amount of historical snapshots are reduced to a much small
number of concepts. A problem in this pioneering approach is that we
cannot afford to perform pair-wise comparison between every two models
to find ``equivalent classes'' of concepts, or relying on an
arbitrary training dataset as a similarity measure. 

%% \cite{charu-framework03}

%% \cite{shixi08}

The idea of developing a meta-model from a set of base models is not
new. For instance, meta-learning~\cite{ml}, query by
committee~\cite{qbc}, and other ensemble
methods~\cite{dietterich00ensemble} have been studied extensively.
The purpose of these approaches is to integrate results from
multiple learning systems in order to reduce the inductive bias of
individual learners, so that the meta-model is more accurate.
However, they do not deal with the challenges of evolving data.

%%% Local Variables:
%%% mode: latex
%%% TeX-master: "vem-icde09"
%%% End:
